Introduction
Materials and Methods
Results
Discussion
Conclusion
Materials and Methods
Obtain data set
Data Wrangling
Exploratory Data Analysis
Analysis and PCA Modeling
Logistic Regression model and Machine Learning
Shiny App
Working collaboratively using RStudio Cloud and GitHub
Results
The data are well separated, so classification appears feasible.
Logistic regression done using tidymodels
Perform binary classification
Predictors: other_diseases, height, weight, famHist2DBin, famHist1DBin
Assessing the predictive ability of the model
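The logistic-regression steps above (fit a binary classifier, then assess its predictive ability) can be sketched in plain Python. This is only an illustration, not the tidymodels code used in the project: the toy data, the two standardized features standing in for height and weight, and the helper names fit_logistic, predict, and metrics are all assumptions made for this example.

```python
import math

def sigmoid(z):
    # Numerically stable logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)

def score(w, x):
    # Linear predictor: bias w[0] plus the weighted sum of the features.
    return w[0] + sum(wj * xj for wj, xj in zip(w[1:], x))

def fit_logistic(X, y, lr=0.1, epochs=2000):
    """Fit weights (bias first) by batch gradient descent on the log-loss."""
    n, d = len(X), len(X[0])
    w = [0.0] * (d + 1)
    for _ in range(epochs):
        grad = [0.0] * (d + 1)
        for xi, yi in zip(X, y):
            err = sigmoid(score(w, xi)) - yi  # predicted probability minus label
            grad[0] += err
            for j, xj in enumerate(xi):
                grad[j + 1] += err * xj
        w = [wj - lr * gj / n for wj, gj in zip(w, grad)]
    return w

def predict(w, X, threshold=0.5):
    # Class 1 when the predicted probability reaches the threshold.
    return [1 if sigmoid(score(w, xi)) >= threshold else 0 for xi in X]

def metrics(y_true, y_pred):
    """Accuracy, sensitivity, and specificity from the confusion-matrix counts."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    return {
        "accuracy": (tp + tn) / len(y_true),
        "sensitivity": tp / (tp + fn) if tp + fn else float("nan"),
        "specificity": tn / (tn + fp) if tn + fp else float("nan"),
    }

# Hypothetical toy data: two standardized features (stand-ins for height and
# weight) and a binary label. These values are invented for illustration only.
X_toy = [[-1.2, -0.8], [-0.9, -1.1], [-0.5, -0.3],
         [0.4, 0.6], [0.9, 1.0], [1.3, 0.7]]
y_toy = [0, 0, 0, 1, 1, 1]

w_toy = fit_logistic(X_toy, y_toy)
print(metrics(y_toy, predict(w_toy, X_toy)))
```

In practice the metrics would be computed on a held-out test split rather than on the training rows, since training-set accuracy overstates predictive ability.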
Shiny app: https://hcehlers.shinyapps.io/DiaPredict/
Discussion
Limited by the data set: the location, race, and habitat of the sampled population limit how well the model generalizes globally
Unique observation: Height seems to impact the likelihood of diabetes
The accuracy of our model could likely be improved with additional predictors and more data points
Scope for cross-platform use and integrated studies
Conclusion
It was feasible to do data analysis and obtain biological insights about our data set
We conclude that height and weight are important indicators of type 1 diabetes
We expected family history to be more important
A more thoroughly described data set would have made it easier to test hypotheses and draw conclusions